Non-record: Entropy-aware Int5-odd BTT-MLP with materialized eval #1223

Open
LucasErcolano wants to merge 1 commit into openai:main from LucasErcolano:codex/entropy-aware-int5odd-bttmlp

@LucasErcolano

Summary

This PR adds a non-record submission package for an entropy-aware int5_odd export objective combined with a TT/BTT-inspired structured MLP and an evaluation-time materialization path.

Main pieces included in the package:

  • entropy-aware training on a 5-bin odd grid {-2,-1,0,1,2} aligned to the exported int5_odd + zlib artifact
  • structured MLP matrices via a 2-core TT/BTT-style StructuredLinear in the MLP while keeping attention dense
  • evaluation-time materialization of the structured MLP into dense weights so validation runs through standard F.linear
  • local and cloud helper scripts, plus the logs referenced by the README
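To make the first bullet concrete, here is a minimal sketch of the quantity an entropy-aware objective would be trying to lower: the empirical entropy of weights snapped to the grid {-2,-1,0,1,2}, next to the size zlib reaches on the same symbols. It is not the code in `train_gpt.py`. The helper names, the per-tensor `scale`, the Gaussian toy weights, and the one-symbol-per-byte packing are all assumptions made for illustration.

```python
import math
import random
import zlib
from collections import Counter

GRID = (-2, -1, 0, 1, 2)  # the 5-bin odd grid named in this PR


def quantize_int5_odd(weights, scale):
    """Round weights / scale to the nearest grid level, clamped to [-2, 2].

    `scale` is a hypothetical per-tensor step size.
    """
    return [max(-2, min(2, round(w / scale))) for w in weights]


def entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def zlib_bytes(symbols):
    """Compressed size when each symbol is stored as one byte (assumed packing)."""
    raw = bytes(s + 2 for s in symbols)  # shift -2..2 to 0..4
    return len(zlib.compress(raw, 9))


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # toy stand-in for a weight tensor

results = {}
for scale in (0.5, 1.0, 2.0):
    q = quantize_int5_odd(weights, scale)
    results[scale] = (entropy_bits(q), zlib_bytes(q))
    print(f"scale={scale}: {results[scale][0]:.3f} bits/symbol, {results[scale][1]} bytes after zlib")
```

A larger step size pushes more weights into the zero bin, which lowers the entropy and, with it, the compressed size. The entropy can never exceed log2(5), roughly 2.32 bits per symbol, which is the ceiling for a 5-level grid.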

Main Result

Recovered 1x H100 80GB HBM3 result (non-record, VAL_TOKEN_LIMIT=1048576):

  • params: 25,727,104
  • pre-quant val_bpb: 5.3457
  • roundtrip val_bpb: 5.88802157
  • quantized size: 5,184,543 bytes
  • total package size: 5,267,667 bytes
  • peak memory: 50,290 MiB allocated / 78,558 MiB reserved
  • optimizer-step training time: 603.522 s for 31 steps
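A few derived figures follow directly from the numbers above. The short calculation below uses only the values listed in this section; the variable names are just labels chosen for the example.

```python
# Values copied from the result list above.
params = 25_727_104
pre_quant_bpb = 5.3457
roundtrip_bpb = 5.88802157
quantized_bytes = 5_184_543
total_bytes = 5_267_667
train_seconds = 603.522
steps = 31

bits_per_param = quantized_bytes * 8 / params   # compressed storage cost per weight
quant_gap_bpb = roundtrip_bpb - pre_quant_bpb   # val_bpb lost in the export roundtrip
overhead_bytes = total_bytes - quantized_bytes  # everything in the package except the weights
seconds_per_step = train_seconds / steps

print(f"bits per parameter: {bits_per_param:.3f}")
print(f"roundtrip gap:      {quant_gap_bpb:.4f} bpb")
print(f"package overhead:   {overhead_bytes} bytes")
print(f"time per step:      {seconds_per_step:.2f} s")
```

The compressed artifact works out to about 1.61 bits per parameter, below the log2(5) ≈ 2.32 bits that a uniformly used 5-level grid would need.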

Systems Note

The key engineering win here is the eval-time materialization hook for the structured MLP. On the 1,048,576-token validation cap, the materialized path cut eval time from 96.99 s to 0.851 s, a speedup of roughly 114x. That is what made this structured stack operationally usable for evaluation on H100.
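The idea behind materialization can be sketched with a generic 2-core factorization in NumPy. This is not the PR's `StructuredLinear`: the core shapes, the index layout, and the function names below are assumptions chosen to show that contracting the cores once yields a dense matrix that gives the same outputs as the factored forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small shapes: in_features = n1 * n2, out_features = m1 * m2.
m1, m2, n1, n2, rank = 4, 8, 4, 8, 3
core_a = rng.standard_normal((m1, n1, rank))
core_b = rng.standard_normal((rank, m2, n2))


def structured_forward(x, core_a, core_b):
    """Apply the factored layer without ever forming the dense matrix."""
    batch = x.shape[0]
    x = x.reshape(batch, n1, n2)
    t = np.einsum("bjk,ijr->birk", x, core_a)   # contract the first input index
    y = np.einsum("birk,rlk->bil", t, core_b)   # contract the rank and second input index
    return y.reshape(batch, m1 * m2)


def materialize(core_a, core_b):
    """Contract the two cores once into a dense (out_features, in_features) matrix."""
    w = np.einsum("ijr,rlk->iljk", core_a, core_b)  # shape (m1, m2, n1, n2)
    return w.reshape(m1 * m2, n1 * n2)


x = rng.standard_normal((5, n1 * n2))
w_dense = materialize(core_a, core_b)

y_structured = structured_forward(x, core_a, core_b)
y_dense = x @ w_dense.T  # same contraction F.linear(x, w_dense) performs

print(np.allclose(y_structured, y_dense))
```

Because the weights are frozen during evaluation, the contraction in `materialize` only has to run once per layer, and every later batch goes through a single dense matmul instead of two smaller contractions plus reshapes.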

Validation

  • py -3.11 -m py_compile records/track_non_record_16mb/2026-03-31_EntropyAware_Int5Odd_BTTMLP_3060/train_gpt.py
  • package includes the logs referenced by the README for compile, init, lambda, scout, local smoke, and H100 runs

Scope

This is intentionally a non-record exploratory PR. The current package is anchored to the best fully recovered 1x H100 run; it is not a full-validation leaderboard submission.
